Weblog Clustering in Multilinear Algebra Perspective

Author

  • Andri Mirzal
Abstract

This paper describes a clustering method that groups the most similar and important weblogs together with their descriptive shared words, using a technique from multilinear algebra known as PARAFAC tensor decomposition. The proposed method first creates a labeled-link network representation of the weblog dataset, where the nodes are the blogs and the labels are the shared words. A 3-way adjacency tensor is then extracted from the network, and the PARAFAC decomposition is applied to the tensor to obtain pairs of node lists and label lists, with scores attached to each list that indicate the degree of importance. The clustering is done by sorting the lists in decreasing order of score and taking the pairs of top-ranked blogs and words. Thus, unlike standard co-clustering methods, this method not only groups similar blogs with their descriptive words but also tends to produce clusters of important blogs and descriptive words.
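The pipeline in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the toy blogs and words are hypothetical, the tensor rule (entry 1 when two distinct blogs share a word) is a guess at the simplest labeled-link construction, the decomposition is a plain alternating least squares (ALS) PARAFAC written in NumPy, and the helper names `adjacency_tensor`, `parafac_als`, and `top_clusters` are invented here.

```python
import numpy as np

# Hypothetical toy data: each blog is reduced to the set of words it uses.
blogs = {
    "blog_a": {"python", "numpy"},
    "blog_b": {"python", "numpy"},
    "blog_c": {"python", "numpy"},
    "blog_d": {"soccer", "goal"},
    "blog_e": {"soccer", "goal"},
    "blog_f": {"soccer", "goal"},
}
names = sorted(blogs)
vocab = sorted(set().union(*blogs.values()))


def adjacency_tensor(blogs, names, vocab):
    """Assumed rule: X[i, j, k] = 1 if distinct blogs i and j share word k."""
    X = np.zeros((len(names), len(names), len(vocab)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i != j:
                for w in blogs[a] & blogs[b]:
                    X[i, j, vocab.index(w)] = 1.0
    return X


def khatri_rao(A, B):
    """Column-wise Kronecker product; rows ordered with B's index fastest."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])


def unfold(X, mode):
    """Matricize X along `mode`, remaining axes flattened in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def parafac_als(X, rank, n_iter=200, seed=0):
    """Plain ALS for a 3-way PARAFAC model X ~ sum_r a_r o b_r o c_r."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            f1, f2 = [factors[m] for m in range(3) if m != mode]
            gram = (f1.T @ f1) * (f2.T @ f2)
            factors[mode] = unfold(X, mode) @ khatri_rao(f1, f2) @ np.linalg.pinv(gram)
    return factors


def top_clusters(factors, names, vocab, n_blogs=3, n_words=2):
    """Per component, sort scores in decreasing order and keep the top entries."""
    A, _, C = factors  # node scores from mode 0, word scores from mode 2
    clusters = []
    for r in range(A.shape[1]):
        blog_idx = np.argsort(-np.abs(A[:, r]))[:n_blogs]
        word_idx = np.argsort(-np.abs(C[:, r]))[:n_words]
        clusters.append(([names[i] for i in blog_idx], [vocab[k] for k in word_idx]))
    return clusters


X = adjacency_tensor(blogs, names, vocab)
factors = parafac_als(X, rank=2)
clusters = top_clusters(factors, names, vocab)
for blog_list, word_list in clusters:
    print(blog_list, word_list)
```

Absolute values are used for ranking because PARAFAC factors are only determined up to sign and scale. Using just the first node factor for blog scores is a simplification; a fuller treatment might combine both node modes.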

Similar resources

Multilinear Complexity is Equivalent to Optimal Tester Size

In this paper we first show that Tester for an F-algebra A and multilinear forms, [2], is equivalent to multilinear algorithm for the product of elements in A, [3]. Our result is constructive in deterministic polynomial time. We show that given a tester of size ν for an F-algebra A and multilinear forms of degree d one can in deterministic polynomial time construct a multilinear algorithm for t...

A Fuzzy Grassroots Ontology for improving Weblog Extraction

This paper presents fuzzy clustering algorithms to establish a grassroots ontology – a machine-generated weak ontology – based on folksonomies. Furthermore, it describes a search engine for vaguely associated terms and aggregates them into several meaningful cluster categories, based on the introduced weak grassroots ontology. A potential application of this ontology, weblog extraction, is illu...

Identifying Network Anomalies Using Clustering Technique in Weblog Data

In this paper we present an approach for identifying network anomalies by visualizing network flow data which is stored in weblogs. Various clustering techniques can be used to identify different anomalies in the network. Here, we present a new approach based on simple K-Means for analyzing network flow data using different attributes like IP address, Protocol, Port number etc. to detect anomal...

Weblog success: Exploring the role of technology

Weblogs have recently gained considerable media attention. Leading weblog sites are already attracting millions of visitors. Yet, success in the highly competitive world of weblogs is not easily achieved. This study seeks to explore weblog success from a technology perspective, i.e. from the impact of weblog-building technology (or blogging tool). Based on an examination of 126 highly successfu...

Journal title:
  • CoRR

Volume abs/0909.2345  Issue 

Pages  -

Publication date 2009